Overview

Dataset statistics

Number of variables: 40
Number of observations: 260601
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 178.9 MiB
Average record size in memory: 720.0 B
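
The two memory figures in this table are linked by simple arithmetic: the average record size times the number of observations gives the total size. A minimal check, using only numbers from the table (the variable names are illustrative, and the reported 720.0 B is itself a rounded value):

```python
# Figures copied from the "Dataset statistics" table.
n_rows = 260_601
avg_record_bytes = 720.0           # "Average record size in memory"

total_bytes = n_rows * avg_record_bytes
total_mib = total_bytes / 2**20    # 1 MiB = 2**20 bytes

print(f"{total_mib:.1f} MiB")      # agrees with the reported total after rounding
```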

Variable types

BOOL: 22
CAT: 9
NUM: 9

Warnings

building_id has unique values (Unique)
geo_level_1_id has 4011 (1.5%) zeros (Zeros)
age has 26041 (10.0%) zeros (Zeros)
count_families has 20862 (8.0%) zeros (Zeros)
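
The percentages in the zero warnings are the zero counts divided by the number of observations. A short sketch that recomputes them from the counts listed here (the dictionary name is illustrative):

```python
n_rows = 260_601  # "Number of observations"

# Zero counts copied from the warnings.
zero_counts = {"geo_level_1_id": 4011, "age": 26041, "count_families": 20862}

# Share of zeros per variable, in percent, rounded to one decimal like the report.
zero_share = {name: round(100 * count / n_rows, 1) for name, count in zero_counts.items()}

for name, share in zero_share.items():
    print(f"{name}: {share}% zeros")
```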

Reproduction

Analysis started: 2020-12-13 20:15:59.787358
Analysis finished: 2020-12-13 20:17:02.416349
Duration: 1 minute and 2.63 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
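
The reported duration is the difference between the two timestamps. A small sketch with the standard library that reproduces it (the format string is an assumption about how the timestamps are written, based on the two values shown):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2020-12-13 20:15:59.787358", FMT)
finished = datetime.strptime("2020-12-13 20:17:02.416349", FMT)

# Elapsed wall-clock time in seconds, split into whole minutes and the remainder.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute and {seconds:.2f} seconds")
```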

Variables

building_id
Real number (ℝ≥0)

UNIQUE

Distinct: 260601
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 525675.4828
Minimum: 4
Maximum: 1052934
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 4
5-th percentile: 52114
Q1: 261190
Median: 525757
Q3: 789762
95-th percentile: 1000724
Maximum: 1052934
Range: 1052930
Interquartile range (IQR): 528572

Descriptive statistics

Standard deviation: 304544.999
Coefficient of variation (CV): 0.5793403136
Kurtosis: -1.203878964
Mean: 525675.4828
Median Absolute Deviation (MAD): 264277
Skewness: 0.001882356737
Sum: 1.369915565e+11
Variance: 9.274765644e+10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
10526701< 0.1%
 
8473041< 0.1%
 
3681021< 0.1%
 
7299861< 0.1%
 
9005781< 0.1%
 
8964801< 0.1%
 
8084151< 0.1%
 
8125051< 0.1%
 
2902641< 0.1%
 
2697821< 0.1%
 
7258921< 0.1%
 
2656841< 0.1%
 
2759231< 0.1%
 
8411671< 0.1%
 
351821< 0.1%
 
3004871< 0.1%
 
5724011< 0.1%
 
8268221< 0.1%
 
8206771< 0.1%
 
3107221< 0.1%
 
6936171< 0.1%
 
4766051< 0.1%
 
4602131< 0.1%
 
4724991< 0.1%
 
9988341< 0.1%
 
Other values (260576)260576> 99.9%
 
ValueCountFrequency (%) 
41< 0.1%
 
81< 0.1%
 
121< 0.1%
 
161< 0.1%
 
171< 0.1%
 
251< 0.1%
 
281< 0.1%
 
311< 0.1%
 
341< 0.1%
 
361< 0.1%
 
ValueCountFrequency (%) 
10529341< 0.1%
 
10529311< 0.1%
 
10529291< 0.1%
 
10529261< 0.1%
 
10529211< 0.1%
 
10529151< 0.1%
 
10529111< 0.1%
 
10529091< 0.1%
 
10529081< 0.1%
 
10529061< 0.1%
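
Several of the building_id statistics are derived from others in the same tables: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A minimal sketch that checks these identities against the reported values (inputs are copied from the report and are rounded, so the comparison uses a tolerance):

```python
import math

# Values copied from the building_id tables.
minimum, maximum = 4, 1_052_934
q1, q3 = 261_190, 789_762
mean, std = 525_675.4828, 304_544.999

value_range = maximum - minimum    # reported Range: 1052930
iqr = q3 - q1                      # reported IQR: 528572
cv = std / mean                    # reported CV: 0.5793403136
variance = std ** 2                # reported Variance: 9.274765644e+10

print(value_range, iqr)
print(math.isclose(cv, 0.5793403136, rel_tol=1e-6))
print(math.isclose(variance, 9.274765644e10, rel_tol=1e-6))
```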
 

geo_level_1_id
Real number (ℝ≥0)

ZEROS

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.90035341
Minimum: 0
Maximum: 30
Zeros: 4011
Zeros (%): 1.5%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 7
Median: 12
Q3: 21
95-th percentile: 27
Maximum: 30
Range: 30
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 8.033616625
Coefficient of variation (CV): 0.5779433361
Kurtosis: -1.213248785
Mean: 13.90035341
Median Absolute Deviation (MAD): 6
Skewness: 0.2725303548
Sum: 3622446
Variance: 64.53899608
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
Value             Count  Frequency (%)
6                 24381  9.4%
26                22615  8.7%
10                22079  8.5%
17                21813  8.4%
8                 19080  7.3%
7                 18994  7.3%
20                17216  6.6%
21                14889  5.7%
4                 14568  5.6%
27                12532  4.8%
13                9608   3.7%
11                8220   3.2%
3                 7540   2.9%
22                6252   2.4%
25                5624   2.2%
16                4332   1.7%
0                 4011   1.5%
9                 3958   1.5%
12                3194   1.2%
18                3189   1.2%
1                 2701   1.0%
5                 2690   1.0%
30                2686   1.0%
15                2320   0.9%
14                1714   0.7%
Other values (6)  4395   1.7%

Value  Count  Frequency (%)
0      4011   1.5%
1      2701   1.0%
2      931    0.4%
3      7540   2.9%
4      14568  5.6%
5      2690   1.0%
6      24381  9.4%
7      18994  7.3%
8      19080  7.3%
9      3958   1.5%

Value  Count  Frequency (%)
30     2686   1.0%
29     396    0.2%
28     265    0.1%
27     12532  4.8%
26     22615  8.7%
25     5624   2.2%
24     1310   0.5%
23     1121   0.4%
22     6252   2.4%
21     14889  5.7%
 

geo_level_2_id
Real number (ℝ≥0)

Distinct: 1414
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 701.0746851
Minimum: 0
Maximum: 1427
Zeros: 38
Zeros (%): < 0.1%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 69
Q1: 350
Median: 702
Q3: 1050
95-th percentile: 1377
Maximum: 1427
Range: 1427
Interquartile range (IQR): 700

Descriptive statistics

Standard deviation: 412.7107336
Coefficient of variation (CV): 0.5886829782
Kurtosis: -1.188232475
Mean: 701.0746851
Median Absolute Deviation (MAD): 349
Skewness: 0.02895738139
Sum: 182700764
Variance: 170330.1496
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
3940381.5%
 
15825201.0%
 
18120800.8%
 
138720400.8%
 
15718970.7%
 
36317600.7%
 
46317400.7%
 
67317040.7%
 
53316840.6%
 
88316260.6%
 
139415370.6%
 
54814970.6%
 
100614500.6%
 
72013590.5%
 
99111450.4%
 
100111350.4%
 
88911140.4%
 
76510910.4%
 
125310900.4%
 
115510690.4%
 
140110630.4%
 
88610530.4%
 
15110430.4%
 
66010410.4%
 
13110380.4%
 
Other values (1389)22178785.1%
 
ValueCountFrequency (%) 
038< 0.1%
 
12040.1%
 
377< 0.1%
 
43150.1%
 
525< 0.1%
 
62< 0.1%
 
7100< 0.1%
 
8120< 0.1%
 
93330.1%
 
103540.1%
 
ValueCountFrequency (%) 
14276< 0.1%
 
14262860.1%
 
14254660.2%
 
14247< 0.1%
 
14233< 0.1%
 
14222160.1%
 
14212540.1%
 
142010< 0.1%
 
141995< 0.1%
 
14181520.1%
 

geo_level_3_id
Real number (ℝ≥0)

Distinct: 11595
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6257.876148
Minimum: 0
Maximum: 12567
Zeros: 2
Zeros (%): < 0.1%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 611
Q1: 3073
Median: 6270
Q3: 9412
95-th percentile: 11927
Maximum: 12567
Range: 12567
Interquartile range (IQR): 6339

Descriptive statistics

Standard deviation: 3646.369645
Coefficient of variation (CV): 0.5826848532
Kurtosis: -1.213896506
Mean: 6257.876148
Median Absolute Deviation (MAD): 3171
Skewness: 0.0003935120899
Sum: 1630808782
Variance: 13296011.59
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
6336510.2%
 
91336470.2%
 
6215300.2%
 
112464700.2%
 
20054660.2%
 
114404550.2%
 
77234430.2%
 
92293810.1%
 
24523490.1%
 
122583120.1%
 
82363030.1%
 
104453020.1%
 
21702830.1%
 
66262830.1%
 
25372590.1%
 
852520.1%
 
4062510.1%
 
69732480.1%
 
78682470.1%
 
39042410.1%
 
102212370.1%
 
107952360.1%
 
18512360.1%
 
113192300.1%
 
107282280.1%
 
Other values (11570)25206196.7%
 
ValueCountFrequency (%) 
02< 0.1%
 
16< 0.1%
 
39< 0.1%
 
514< 0.1%
 
621< 0.1%
 
72< 0.1%
 
831< 0.1%
 
93< 0.1%
 
101< 0.1%
 
1162< 0.1%
 
ValueCountFrequency (%) 
125671< 0.1%
 
125657< 0.1%
 
125646< 0.1%
 
1256324< 0.1%
 
125623< 0.1%
 
1256119< 0.1%
 
1256017< 0.1%
 
125596< 0.1%
 
125586< 0.1%
 
1255744< 0.1%
 

count_floors_pre_eq
Real number (ℝ≥0)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.129723217
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 2
Q3: 2
95-th percentile: 3
Maximum: 9
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.7276645453
Coefficient of variation (CV): 0.3416709456
Kurtosis: 2.322597881
Mean: 2.129723217
Median Absolute Deviation (MAD): 0
Skewness: 0.8341129586
Sum: 555008
Variance: 0.5294956905
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value  Count   Frequency (%)
2      156623  60.1%
3      55617   21.3%
1      40441   15.5%
4      5424    2.1%
5      2246    0.9%
6      209     0.1%
7      39      < 0.1%
9      1       < 0.1%
8      1       < 0.1%

Value  Count   Frequency (%)
1      40441   15.5%
2      156623  60.1%
3      55617   21.3%
4      5424    2.1%
5      2246    0.9%
6      209     0.1%
7      39      < 0.1%
8      1       < 0.1%
9      1       < 0.1%

Value  Count   Frequency (%)
9      1       < 0.1%
8      1       < 0.1%
7      39      < 0.1%
6      209     0.1%
5      2246    0.9%
4      5424    2.1%
3      55617   21.3%
2      156623  60.1%
1      40441   15.5%
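
Because count_floors_pre_eq takes only nine distinct values, its summary statistics can be recomputed directly from the value counts. The sketch below rebuilds the sum, mean, median and median absolute deviation from the frequency table; it shows why both the IQR and the MAD are 0 (more than half of the buildings have exactly 2 floors). The helper `lower_median` is a hypothetical name introduced here, not part of pandas-profiling.

```python
from collections import defaultdict

# Value -> count, copied from the count_floors_pre_eq frequency table.
counts = {1: 40441, 2: 156623, 3: 55617, 4: 5424, 5: 2246, 6: 209, 7: 39, 8: 1, 9: 1}


def lower_median(freq):
    """Median of the sample described by a {value: count} table.

    For an even sample size this returns the lower of the two middle values.
    """
    n = sum(freq.values())
    middle = (n + 1) // 2              # 1-based rank of the (lower) middle element
    seen = 0
    for value in sorted(freq):
        seen += freq[value]
        if seen >= middle:
            return value
    raise ValueError("empty frequency table")


n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n
median = lower_median(counts)

# Frequency table of absolute deviations from the median, then its median (MAD).
deviations = defaultdict(int)
for value, count in counts.items():
    deviations[abs(value - median)] += count
mad = lower_median(deviations)

print(n, total, round(mean, 6), median, mad)
```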
 

age
Real number (ℝ≥0)

ZEROS

Distinct: 42
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.53502865
Minimum: 0
Maximum: 995
Zeros: 26041
Zeros (%): 10.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 10
Median: 15
Q3: 30
95-th percentile: 60
Maximum: 995
Range: 995
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 73.56593652
Coefficient of variation (CV): 2.772408408
Kurtosis: 157.2482363
Mean: 26.53502865
Median Absolute Deviation (MAD): 10
Skewness: 12.19249422
Sum: 6915055
Variance: 5411.947016
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=42)
Value              Count  Frequency (%)
10                 38896  14.9%
15                 36010  13.8%
5                  33697  12.9%
20                 32182  12.3%
0                  26041  10.0%
25                 24366  9.3%
30                 18028  6.9%
35                 10710  4.1%
40                 10559  4.1%
50                 7257   2.8%
45                 4711   1.8%
60                 3612   1.4%
80                 3055   1.2%
55                 2033   0.8%
70                 1975   0.8%
995                1390   0.5%
100                1364   0.5%
65                 1123   0.4%
90                 1085   0.4%
85                 847    0.3%
75                 512    0.2%
95                 414    0.2%
120                180    0.1%
150                142    0.1%
200                106    < 0.1%
Other values (17)  306    0.1%

Value  Count  Frequency (%)
0      26041  10.0%
5      33697  12.9%
10     38896  14.9%
15     36010  13.8%
20     32182  12.3%
25     24366  9.3%
30     18028  6.9%
35     10710  4.1%
40     10559  4.1%
45     4711   1.8%

Value  Count  Frequency (%)
995    1390   0.5%
200    106    < 0.1%
195    2      < 0.1%
190    3      < 0.1%
185    1      < 0.1%
180    7      < 0.1%
175    5      < 0.1%
170    6      < 0.1%
165    2      < 0.1%
160    6      < 0.1%
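
The value 995 stands far apart from the rest of the age distribution (the next largest value is 200) yet occurs 1390 times, which helps explain the large skewness and kurtosis. The sketch below estimates how much those rows pull the mean upward, using only the reported sum and counts. Treating 995 as a placeholder or cap rather than a true age is an assumption made here for illustration; the report itself does not say so.

```python
# Figures copied from the age tables.
n_rows = 260_601
total_age = 6_915_055                        # "Sum"
outlier_value, outlier_count = 995, 1_390

mean_all = total_age / n_rows

# Mean over the remaining rows if the 995 entries are set aside (assumption:
# they are not genuine ages; the report does not confirm this).
mean_without = (total_age - outlier_value * outlier_count) / (n_rows - outlier_count)

print(f"mean with 995 rows:    {mean_all:.2f}")
print(f"mean without 995 rows: {mean_without:.2f}")
```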
 

area_percentage
Real number (ℝ≥0)

Distinct: 84
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.018050583
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 5
Median: 7
Q3: 9
95-th percentile: 16
Maximum: 100
Range: 99
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 4.392230936
Coefficient of variation (CV): 0.5477928694
Kurtosis: 30.43825794
Mean: 8.018050583
Median Absolute Deviation (MAD): 2
Skewness: 3.526082314
Sum: 2089512
Variance: 19.29169259
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
64201316.1%
 
73675214.1%
 
53272412.6%
 
82844510.9%
 
9221998.5%
 
4192367.4%
 
10156136.0%
 
11139075.3%
 
3118374.5%
 
1275812.9%
 
1358152.2%
 
1441621.6%
 
1534891.3%
 
231811.2%
 
1626061.0%
 
1724891.0%
 
1916020.6%
 
1813170.5%
 
2010530.4%
 
238650.3%
 
216450.2%
 
244050.2%
 
223910.2%
 
252600.1%
 
262470.1%
 
Other values (59)17670.7%
 
ValueCountFrequency (%) 
190< 0.1%
 
231811.2%
 
3118374.5%
 
4192367.4%
 
53272412.6%
 
64201316.1%
 
73675214.1%
 
82844510.9%
 
9221998.5%
 
10156136.0%
 
ValueCountFrequency (%) 
1001< 0.1%
 
963< 0.1%
 
901< 0.1%
 
865< 0.1%
 
854< 0.1%
 
843< 0.1%
 
833< 0.1%
 
821< 0.1%
 
801< 0.1%
 
781< 0.1%
 

height_percentage
Real number (ℝ≥0)

Distinct: 27
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.434365179
Minimum: 2
Maximum: 32
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 2
5-th percentile: 3
Q1: 4
Median: 5
Q3: 6
95-th percentile: 9
Maximum: 32
Range: 30
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.918418221
Coefficient of variation (CV): 0.3530160667
Kurtosis: 14.31852616
Mean: 5.434365179
Median Absolute Deviation (MAD): 1
Skewness: 1.808261757
Sum: 1416201
Variance: 3.68032847
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=27)
ValueCountFrequency (%) 
57851330.1%
 
64647717.8%
 
43776314.5%
 
73546513.6%
 
32595710.0%
 
8139025.3%
 
293053.6%
 
953762.1%
 
1044921.7%
 
119170.4%
 
129070.3%
 
137590.3%
 
152920.1%
 
161790.1%
 
3275< 0.1%
 
1871< 0.1%
 
1466< 0.1%
 
2033< 0.1%
 
2113< 0.1%
 
2311< 0.1%
 
179< 0.1%
 
197< 0.1%
 
244< 0.1%
 
253< 0.1%
 
262< 0.1%
 
Other values (2)3< 0.1%
 
ValueCountFrequency (%) 
293053.6%
 
32595710.0%
 
43776314.5%
 
57851330.1%
 
64647717.8%
 
73546513.6%
 
8139025.3%
 
953762.1%
 
1044921.7%
 
119170.4%
 
ValueCountFrequency (%) 
3275< 0.1%
 
311< 0.1%
 
282< 0.1%
 
262< 0.1%
 
253< 0.1%
 
244< 0.1%
 
2311< 0.1%
 
2113< 0.1%
 
2033< 0.1%
 
197< 0.1%
 
Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
t      216757  83.2%
n      35528   13.6%
o      8316    3.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
t21675783.2%
 
n3552813.6%
 
o83163.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter260601100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
t21675783.2%
 
n3552813.6%
 
o83163.2%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin260601100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
t21675783.2%
 
n3552813.6%
 
o83163.2%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII260601100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
t21675783.2%
 
n3552813.6%
 
o83163.2%
 

foundation_type
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
r      219196  84.1%
w      15118   5.8%
u      14260   5.5%
i      10579   4.1%
h      1448    0.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 5
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
r21919684.1%
 
w151185.8%
 
u142605.5%
 
i105794.1%
 
h14480.6%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter260601100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
r21919684.1%
 
w151185.8%
 
u142605.5%
 
i105794.1%
 
h14480.6%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin260601100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
r21919684.1%
 
w151185.8%
 
u142605.5%
 
i105794.1%
 
h14480.6%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII260601100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
r21919684.1%
 
w151185.8%
 
u142605.5%
 
i105794.1%
 
h14480.6%
 

roof_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
n      182842  70.2%
q      61576   23.6%
x      16183   6.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
n18284270.2%
 
q6157623.6%
 
x161836.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter260601100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n18284270.2%
 
q6157623.6%
 
x161836.2%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin260601100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
n18284270.2%
 
q6157623.6%
 
x161836.2%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII260601100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
n18284270.2%
 
q6157623.6%
 
x161836.2%
 
Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
f      209619  80.4%
x      24877   9.5%
v      24593   9.4%
z      1004    0.4%
m      508     0.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 5
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
f20961980.4%
 
x248779.5%
 
v245939.4%
 
z10040.4%
 
m5080.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter260601100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
f20961980.4%
 
x248779.5%
 
v245939.4%
 
z10040.4%
 
m5080.2%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin260601100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
f20961980.4%
 
x248779.5%
 
v245939.4%
 
z10040.4%
 
m5080.2%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII260601100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
f20961980.4%
 
x248779.5%
 
v245939.4%
 
z10040.4%
 
m5080.2%
 

other_floor_type
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
q      165282  63.4%
x      43448   16.7%
j      39843   15.3%
s      12028   4.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
q16528263.4%
 
x4344816.7%
 
j3984315.3%
 
s120284.6%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter260601100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
q16528263.4%
 
x4344816.7%
 
j3984315.3%
 
s120284.6%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin260601100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
q16528263.4%
 
x4344816.7%
 
j3984315.3%
 
s120284.6%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII260601100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
q16528263.4%
 
x4344816.7%
 
j3984315.3%
 
s120284.6%
 

position
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
s      202090  77.5%
t      42896   16.5%
j      13282   5.1%
o      2333    0.9%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
s20209077.5%
 
t4289616.5%
 
j132825.1%
 
o23330.9%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter260601100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
s20209077.5%
 
t4289616.5%
 
j132825.1%
 
o23330.9%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin260601100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
s20209077.5%
 
t4289616.5%
 
j132825.1%
 
o23330.9%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII260601100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
s20209077.5%
 
t4289616.5%
 
j132825.1%
 
o23330.9%
 
Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
d      250072  96.0%
q      5692    2.2%
u      3649    1.4%
s      346     0.1%
c      325     0.1%
a      252     0.1%
o      159     0.1%
m      46      < 0.1%
n      38      < 0.1%
f      22      < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 10
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
d25007296.0%
 
q56922.2%
 
u36491.4%
 
s3460.1%
 
c3250.1%
 
a2520.1%
 
o1590.1%
 
m46< 0.1%
 
n38< 0.1%
 
f22< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter260601100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
d25007296.0%
 
q56922.2%
 
u36491.4%
 
s3460.1%
 
c3250.1%
 
a2520.1%
 
o1590.1%
 
m46< 0.1%
 
n38< 0.1%
 
f22< 0.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin260601100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
d25007296.0%
 
q56922.2%
 
u36491.4%
 
s3460.1%
 
c3250.1%
 
a2520.1%
 
o1590.1%
 
m46< 0.1%
 
n38< 0.1%
 
f22< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII260601100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
d25007296.0%
 
q56922.2%
 
u36491.4%
 
s3460.1%
 
c3250.1%
 
a2520.1%
 
o1590.1%
 
m46< 0.1%
 
n38< 0.1%
 
f22< 0.1%
 
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      237500  91.1%
1      23101   8.9%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
1      198561  76.2%
0      62040   23.8%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      251654  96.6%
1      8947    3.4%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      255849  98.2%
1      4752    1.8%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      242840  93.2%
1      17761   6.8%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      240986  92.5%
1      19615   7.5%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      194151  74.5%
1      66450   25.5%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      238447  91.5%
1      22154   8.5%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      249502  95.7%
1      11099   4.3%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      256468  98.4%
1      4133    1.6%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      256696  98.5%
1      3905    1.5%

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
v      250939  96.3%
a      5512    2.1%
w      2677    1.0%
r      1473    0.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
v25093996.3%
 
a55122.1%
 
w26771.0%
 
r14730.6%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter260601100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
v25093996.3%
 
a55122.1%
 
w26771.0%
 
r14730.6%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin260601100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
v25093996.3%
 
a55122.1%
 
w26771.0%
 
r14730.6%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII260601100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
v25093996.3%
 
a55122.1%
 
w26771.0%
 
r14730.6%
 

count_families
Real number (ℝ≥0)

ZEROS

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9839486418
Minimum: 0
Maximum: 9
Zeros: 20862
Zeros (%): 8.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 1
Q3: 1
95-th percentile: 2
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4183889779
Coefficient of variation (CV): 0.425214244
Kurtosis: 17.67094319
Mean: 0.9839486418
Median Absolute Deviation (MAD): 0
Skewness: 1.634757873
Sum: 256418
Variance: 0.1750493368
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value  Count   Frequency (%)
1      226115  86.8%
0      20862   8.0%
2      11294   4.3%
3      1802    0.7%
4      389     0.1%
5      104     < 0.1%
6      22      < 0.1%
7      7       < 0.1%
9      4       < 0.1%
8      2       < 0.1%

Value  Count   Frequency (%)
0      20862   8.0%
1      226115  86.8%
2      11294   4.3%
3      1802    0.7%
4      389     0.1%
5      104     < 0.1%
6      22      < 0.1%
7      7       < 0.1%
8      2       < 0.1%
9      4       < 0.1%

Value  Count   Frequency (%)
9      4       < 0.1%
8      2       < 0.1%
7      7       < 0.1%
6      22      < 0.1%
5      104     < 0.1%
4      389     0.1%
3      1802    0.7%
2      11294   4.3%
1      226115  86.8%
0      20862   8.0%
 
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      231445  88.8%
1      29156   11.2%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      243824  93.6%
1      16777   6.4%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      251838  96.6%
1      8763    3.4%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      258490  99.2%
1      2111    0.8%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      260356  99.9%
1      245     0.1%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      260507  > 99.9%
1      94      < 0.1%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      260322  99.9%
1      279     0.1%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      260552  > 99.9%
1      49      < 0.1%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      260563  > 99.9%
1      38      < 0.1%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      260578  > 99.9%
1      23      < 0.1%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      259267  99.5%
1      1334    0.5%

damage_grade
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
2      148259  56.9%
3      87218   33.5%
1      25124   9.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count   Frequency (%)
2      148259  56.9%
3      87218   33.5%
1      25124   9.6%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  260601  100.0%

Most frequent Decimal Number characters

Value  Count   Frequency (%)
2      148259  56.9%
3      87218   33.5%
1      25124   9.6%

Most occurring scripts

Value   Count   Frequency (%)
Common  260601  100.0%

Most frequent Common characters

Value  Count   Frequency (%)
2      148259  56.9%
3      87218   33.5%
1      25124   9.6%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  260601  100.0%

Most frequent ASCII characters

Value  Count   Frequency (%)
2      148259  56.9%
3      87218   33.5%
1      25124   9.6%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
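
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report generator runs; the function name `pearson_r` and the sample data are assumptions made for the example.

```python
import math

def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population covariance and standard deviations (the 1/n factors cancel in r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    # A constant column has zero standard deviation; r is undefined in that case.
    return cov / (std_x * std_y)

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
r_up = pearson_r(xs, [2 * x + 1 for x in xs])        # rising line
r_scaled = pearson_r(xs, [100 * x - 7 for x in xs])  # steeper line, shifted
r_down = pearson_r(xs, [-3 * x for x in xs])         # falling line
print(r_up, r_scaled, r_down)
```

The two rising lines give the same r despite different slopes and offsets, which illustrates the invariance under location and scale; the falling line flips the sign.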

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
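
As a rough sketch of this calculation, the example below ranks both variables and then applies the Pearson formula to the ranks. It assumes ties receive the average of their ranks, which is a common convention but not stated in the report; the helper names `average_ranks`, `pearson_r`, and `spearman_rho` are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions (assumed convention)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # monotonic but not linear
rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
tied_ranks = average_ranks([10, 20, 20, 30])
print(rho, r, tied_ranks)
```

For the cubic relationship, ρ reaches 1 while Pearson's r stays below 1, which is the difference between monotonic and linear correlation described above.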

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
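
The pair-counting definition above corresponds to the variant usually called τ-a. The sketch below implements exactly that definition; statistical libraries often default to a tie-corrected variant (τ-b), so results can differ when the data contain ties. The function name `kendall_tau_a` and the sample data are illustrative assumptions.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    pairs = list(combinations(zip(xs, ys), 2))
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in pairs:
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)

xs = [1, 2, 3, 4]
tau_swapped = kendall_tau_a(xs, [1, 3, 2, 4])   # one swapped pair out of six
tau_reversed = kendall_tau_a(xs, [4, 3, 2, 1])  # every pair is discordant
print(tau_swapped, tau_reversed)
```

With four observations there are six pairs; one swap leaves five concordant and one discordant pair, giving (5 - 1) / 6. The quadratic loop is fine for an illustration but too slow for a column with 260601 rows.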

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
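
A minimal sketch of a bias-corrected Cramér's V follows, using the commonly cited form of Bergsma's correction applied to a plain chi-squared statistic without continuity correction. Whether the report generator applies a continuity correction is not stated here, so treat the details as assumptions; the function name `cramers_v_corrected` and the example tables are hypothetical.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    r, c = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(c)]

    # Pearson chi-squared statistic (assumes every row and column total is non-zero).
    chi2 = 0.0
    for i in range(r):
        for j in range(c):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (r - 1) * (c - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    c_corr = c - (c - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(r_corr - 1, c_corr - 1))

v_independent = cramers_v_corrected([[10, 20], [20, 40]])  # rows are proportional
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])        # each row maps to one column
print(v_independent, v_perfect)
```

The proportional table yields 0 and the diagonal table yields 1, matching the two ends of the range described above.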

Missing values


Sample

First rows

building_id geo_level_1_id geo_level_2_id geo_level_3_id count_floors_pre_eq age area_percentage height_percentage land_surface_condition foundation_type roof_type ground_floor_type other_floor_type position plan_configuration has_superstructure_adobe_mud has_superstructure_mud_mortar_stone has_superstructure_stone_flag has_superstructure_cement_mortar_stone has_superstructure_mud_mortar_brick has_superstructure_cement_mortar_brick has_superstructure_timber has_superstructure_bamboo has_superstructure_rc_non_engineered has_superstructure_rc_engineered has_superstructure_other legal_ownership_status count_families has_secondary_use has_secondary_use_agriculture has_secondary_use_hotel has_secondary_use_rental has_secondary_use_institution has_secondary_use_school has_secondary_use_industry has_secondary_use_health_post has_secondary_use_gov_office has_secondary_use_use_police has_secondary_use_other damage_grade
080290664871219823065trnfqtd11000000000v1000000000003
1288308900281221087ornxqsd01000000000v1000000000002
29494721363897321055trnfxtd01000000000v1000000000003
3590882224181069421065trnfxsd01000011000v1000000000002
420194411131148833089trnfxsd10000000000v1000000000003
53330208558608921095trnfqsd01000000000v1110000000002
672845194751206622534nrnxqsd01000000000v1000000000003
747551520323122362086twqvxsu00000110000v1000000000001
84411260757721921586trqfqsd01000010000v1000000000002
99895002688699410134tinvjsd00000100000v1000000000001

Last rows

building_id geo_level_1_id geo_level_2_id geo_level_3_id count_floors_pre_eq age area_percentage height_percentage land_surface_condition foundation_type roof_type ground_floor_type other_floor_type position plan_configuration has_superstructure_adobe_mud has_superstructure_mud_mortar_stone has_superstructure_stone_flag has_superstructure_cement_mortar_stone has_superstructure_mud_mortar_brick has_superstructure_cement_mortar_brick has_superstructure_timber has_superstructure_bamboo has_superstructure_rc_non_engineered has_superstructure_rc_engineered has_superstructure_other legal_ownership_status count_families has_secondary_use has_secondary_use_agriculture has_secondary_use_hotel has_secondary_use_rental has_secondary_use_institution has_secondary_use_school has_secondary_use_industry has_secondary_use_health_post has_secondary_use_gov_office has_secondary_use_use_police has_secondary_use_other damage_grade
26059156080520368598012553nrnfjsd01000000000v1110000000003
260592207683101382190322555trnfqsd01000010000v1000000000002
2605932264218767861325135trnfqsd01000000000v1110000000002
260594159555271811537601312trnfxjd00001000000v1000000000002
2605958270128268471822085trnfqsd01000000000v1000000000003
260596688636251335162115563nrnfjsq01000000000v1000000000002
2605976694851771520602065trnfqsd01000000000v1000000000003
2605986025121751816335567trqfqsd01000000000v1000000000003
26059915140926391851210146trxvsjd00000100000v1000000000002
260600747594219910131076nrnfqjd01000000000v3000000000003